assumption violation
Assumption violations in causal discovery and the robustness of score matching
When domain knowledge is limited and experimentation is restricted by ethical, financial, or time constraints, practitioners turn to observational causal discovery methods to recover the causal structure, exploiting the statistical properties of their data. Because causal discovery without further assumptions is an ill-posed problem, each algorithm comes with its own set of usually untestable assumptions, some of which are hard to meet in real datasets. Motivated by these considerations, this paper extensively benchmarks the empirical performance of recent causal discovery methods on observational data generated under different background conditions, allowing for violations of the critical assumptions required by each selected approach. Our experimental findings show that score matching-based methods demonstrate surprising performance in the false positive and false negative rate of the inferred graph in these challenging scenarios, and we provide theoretical insights into their performance. This work is also the first effort to benchmark the stability of causal discovery algorithms with respect to the values of their hyperparameters. Finally, we hope this paper will set a new standard for the evaluation of causal discovery methods and can serve as an accessible entry point for practitioners interested in the field, highlighting the empirical implications of different algorithm choices.
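The false positive and false negative rates of an inferred graph, mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes directed graphs encoded as 0/1 adjacency matrices and the common convention of computing the false positive rate over absent true edges and the false negative rate over present true edges; the paper's exact metric definitions may differ, and the function name `edge_error_rates` is hypothetical.

```python
def edge_error_rates(true_adj, pred_adj):
    """Return (FPR, FNR) over ordered node pairs i != j.

    Assumed convention: adj[i][j] == 1 means a directed edge i -> j.
    """
    n = len(true_adj)
    fp = fn = pos = neg = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are not counted
            if true_adj[i][j]:
                pos += 1
                if not pred_adj[i][j]:
                    fn += 1  # true edge missed by the prediction
            else:
                neg += 1
                if pred_adj[i][j]:
                    fp += 1  # predicted edge absent from the truth
    fpr = fp / neg if neg else 0.0
    fnr = fn / pos if pos else 0.0
    return fpr, fnr


# Ground truth: 0 -> 1 -> 2. The prediction misses 1 -> 2 and adds 0 -> 2.
truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
pred = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
fpr, fnr = edge_error_rates(truth, pred)
```

In this toy case one of the four absent edges is wrongly predicted and one of the two true edges is missed.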
INPROVF: Leveraging Large Language Models to Repair High-level Robot Controllers from Assumption Violations
Meng, Qian, Zhou, Jin Peng, Weinberger, Kilian Q., Kress-Gazit, Hadas
This paper presents INPROVF, an automatic framework that combines large language models (LLMs) and formal methods to speed up the repair process of high-level robot controllers. Previous approaches based solely on formal methods are computationally expensive and cannot scale to large state spaces. In contrast, INPROVF uses LLMs to generate repair candidates, and formal methods to verify their correctness. To improve the quality of these candidates, our framework first translates the symbolic representations of the environment and controllers into natural language descriptions. If a candidate fails the verification, INPROVF provides feedback on potential unsafe behaviors or unsatisfied tasks, and iteratively prompts LLMs to generate improved solutions. We demonstrate the effectiveness of INPROVF through 12 violations with various workspaces, tasks, and state space sizes.
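The generate, verify, and feedback cycle described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the INPROVF implementation: `repair_loop`, `toy_generate`, and `toy_verify` are hypothetical names, the generator stands in for an LLM call, and the verifier stands in for a formal-methods check.

```python
from typing import Callable, Optional, Tuple


def repair_loop(
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    description: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Ask for candidates until one verifies or the round budget runs out."""
    feedback = None
    for _ in range(max_rounds):
        # The generator sees the natural-language description and, after the
        # first round, the verifier's feedback on the previous candidate.
        candidate = generate(description, feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate
    return None  # no verified repair found within the budget


def toy_generate(description, feedback):
    # Stand-in for an LLM: proposes a better candidate once it sees feedback.
    return "wait" if feedback is None else "open_door"


def toy_verify(candidate):
    # Stand-in for a formal verifier that reports what is unsatisfied.
    if candidate == "open_door":
        return True, ""
    return False, "task 'reach room B' is unsatisfied"


result = repair_loop(toy_generate, toy_verify, "Robot is in room A; the door is closed.")
```

Keeping the verifier as the only component that accepts a candidate means a poor generator can waste rounds but cannot produce an unverified repair.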
Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms
Faltenbacher, Sofia, Wahl, Jonas, Herman, Rebecca, Runge, Jakob
Causal discovery aims to infer causal graphs from observational or experimental data. Methods such as the popular PC algorithm are based on conditional independence testing and utilize enabling assumptions, such as the faithfulness assumption, for their inferences. In practice, these assumptions, as well as the functional assumptions inherited from the chosen conditional independence test, are typically taken as given and not further tested for their validity on the data. In this work, we propose internal coherency scores that allow testing for assumption violations and finite sample errors, whenever these are detectable, without requiring ground truth or further statistical tests. We provide a complete classification of erroneous results, including a distinction between detectable and undetectable errors, and prove that the detectable erroneous results can be measured by our scores. We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets, and envision that testing for internal coherency can become a standard tool in applying constraint-based methods, much like a suite of tests is used to validate the assumptions of classical regression analysis.
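For context on the conditional independence testing that the PC algorithm relies on, the following minimal sketch shows one common choice for linear-Gaussian data, a Fisher z-test on a partial correlation. It is not the coherency score proposed in the paper, the function names are hypothetical, and the correlations are given directly rather than estimated from data.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given a single variable Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def fisher_z_pvalue(r, n, cond_size):
    """Two-sided p-value for H0: the (partial) correlation is zero.

    n is the sample size and cond_size the number of conditioning variables.
    Assumes |r| < 1 and n > cond_size + 3.
    """
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher z-transform
    stat = math.sqrt(n - cond_size - 3) * abs(z)
    # P(|N(0, 1)| > stat) expressed through the complementary error function
    return math.erfc(stat / math.sqrt(2))


# Linear-Gaussian chain X -> Z -> Y with corr(X, Z) = corr(Z, Y) = 0.5,
# which implies corr(X, Y) = 0.25.
r_marginal = 0.25
r_given_z = partial_corr(0.25, 0.5, 0.5)
p_marginal = fisher_z_pvalue(r_marginal, n=500, cond_size=0)
p_given_z = fisher_z_pvalue(r_given_z, n=500, cond_size=1)
```

Here X and Y are dependent marginally but independent given Z, which is the pattern a constraint-based method uses to remove the X-Y edge. The test inherits the linear-Gaussian assumption, one example of the functional assumptions the abstract refers to.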
Assumption violations in causal discovery and the robustness of score matching
When domain knowledge is limited and experimentation is restricted by ethical, financial, or time constraints, practitioners turn to observational causal discovery methods to recover the causal structure, exploiting the statistical properties of their data. Because causal discovery without further assumptions is an ill-posed problem, each algorithm comes with its own set of usually untestable assumptions, some of which are hard to meet in real datasets. Motivated by these considerations, this paper extensively benchmarks the empirical performance of recent causal discovery methods on observational iid data generated under different background conditions, allowing for violations of the critical assumptions required by each selected approach. Our experimental findings show that score matching-based methods demonstrate surprising performance in the false positive and false negative rate of the inferred graph in these challenging scenarios, and we provide theoretical insights into their performance. This work is also the first effort to benchmark the stability of causal discovery algorithms with respect to the values of their hyperparameters.
Automated Robot Recovery from Assumption Violations of High-Level Specifications
Meng, Qian, Kress-Gazit, Hadas
This paper presents a framework that enables robots to automatically recover from assumption violations of high-level specifications during task execution. In contrast to previous methods relying on user intervention to impose additional assumptions for failure recovery, our approach leverages synthesis-based repair to suggest new robot skills that, when implemented, repair the task. Our approach detects violations of environment safety assumptions during the task execution, relaxes the assumptions to admit observed environment behaviors, and acquires new robot skills for task completion. We demonstrate our approach with a Hello Robot Stretch in a factory-like scenario.
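The detect-and-relax steps described above can be illustrated with a deliberately simplified sketch. It assumes the environment safety assumption is represented as a plain set of allowed state transitions, whereas the paper works with high-level specifications and synthesis-based repair; the function name `monitor_and_relax` and the door states are hypothetical, and skill acquisition is not shown.

```python
def monitor_and_relax(allowed, observed):
    """Detect assumption violations and relax the assumption to admit them.

    allowed:  set of (state, next_state) environment transitions assumed possible.
    observed: iterable of (state, next_state) transitions seen at run time.
    Returns (violations, relaxed_assumption).
    """
    violations = [t for t in observed if t not in allowed]
    # Relaxation: admit every observed behavior that the assumption excluded.
    relaxed = set(allowed) | set(violations)
    return violations, relaxed


# Assumption: once the door is open, the environment never closes it again.
allowed = {
    ("door_closed", "door_closed"),
    ("door_closed", "door_open"),
    ("door_open", "door_open"),
}
observed = [("door_closed", "door_open"), ("door_open", "door_closed")]
violations, relaxed = monitor_and_relax(allowed, observed)
```

After relaxation the assumption is weaker, so the original controller may no longer guarantee the task; that gap is what new robot skills would need to close.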
Preemptive Detection of Unsafe Motion Liable for Hazard
Nishi, Masataka (Hitachi Research Laboratory, Hitachi Ltd.)
Establishing a safety standard for autonomous vehicles operating in open and dynamic environments is a challenge. As collisions are inevitable in over-constrained situations, we focus on deciding the liability for a hazard. Our insight is that hazards caused by malfunctions of autonomous vehicles result from loss of functional integrity. Design defects may leave this loss unnoticed, or the real world may make integrity-preserving motion infeasible. Guaranteeing functional integrity in an observable way at run-time is indispensable for revealing defects by using formal root-cause analysis, and for supporting safety claims by dismissing unreasonable doubts about design defects. From a practical standpoint, we attempt to formalize a verification problem that consists of a novel criterion for determining liability for a hazard, a safety claim composed of confirmed observable states, and assumptions underlying the safety claim. We propose a run-time scheme for monitoring events that may lead to violations of the assumptions and a precursor to root causes leading to loss of functional integrity and consequent hazards. We formulate a means of preemptively detecting unsafe motions liable to be hazardous as a satisfiability problem within the framework of adversarial motion planning subject to assumptions on the maneuverability of movers. A numerical study shows that the run-time scheme using a non-linear programming (NLP) encoding is viable in a real-world setting.